[data] Slice output blocks to respect target block size #40248
Conversation
Signed-off-by: Stephanie Wang <swang@cs.berkeley.edu>
Nice fix. Have you run the benchmarks? I'd like to understand the perf impact.
Good idea, will do this.
Did some spot checks on the single-node performance benchmarks, and it seems like there's no obvious difference.
Revert "[data] Slice output blocks to respect target block size (ray-project#40248)" This reverts commit d5f1eed.
#40248 changed output block creation so that when a task produces its output blocks, it tries to slice them before yielding to respect the target block size. Unfortunately, all-to-all ops currently don't support dynamic block splitting. This means that if we fuse an upstream map iterator with an all-to-all op, the all-to-all task has to fuse all of the sliced blocks back together again, which appears to increase memory usage significantly. This PR avoids the issue by overriding the upstream map iterator's target block size to infinity when it is fused with an all-to-all op. It also adds a logger warning explaining how to work around the issue. Related issue number: Closes #40518. --------- Signed-off-by: Stephanie Wang <swang@cs.berkeley.edu>
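The fusion rule described above can be sketched as follows. This is a minimal illustration, not Ray Data's actual internals; the function name and signature are hypothetical.

```python
import math

def effective_target_block_size(
    map_target_bytes: int, fused_with_all_to_all: bool
) -> float:
    """Return the block-size limit a fused map task should use.

    Hypothetical sketch: when the map operator is fused with an
    all-to-all operator (which cannot dynamically split blocks),
    slicing the map's outputs would only be undone by the all-to-all
    task, increasing heap memory. So the target is overridden to
    infinity, i.e. slicing is disabled.
    """
    if fused_with_all_to_all:
        return math.inf
    return map_target_bytes

# Standalone map: the configured target (e.g. 128 MiB) is respected.
print(effective_target_block_size(128 * 1024 * 1024, False))  # 134217728
# Fused with an all-to-all op: no slicing.
print(effective_target_block_size(128 * 1024 * 1024, True))   # inf
```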
This addresses #40759 and #38400 for the 2.8 release branch. Either this change or reverting #40248 appears to fix #40759, but the root cause has not been identified yet. For #38400, we will merge a longer-term fix to master for 2.9. This PR should be safe since it reverts the Data block size back to the 2.7 default. Signed-off-by: Stephanie Wang <swang@cs.berkeley.edu>
With #40248, block sizes are now respected. This increases the default shuffle block size to 1GiB, which restores the previous behavior in the release test dataset_shuffle_sort_1tb. There is a possibility that this increases worker heap memory pressure during shuffle operations, but that can be resolved by overriding DataContext. Related issue number: Closes #38400. --------- Signed-off-by: Stephanie Wang <swang@cs.berkeley.edu>
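The trade-off behind the 1 GiB shuffle default comes down to block count: for a fixed dataset size, a larger target block size means fewer (but heavier) blocks to shuffle. A small arithmetic sketch, using the 1 TB dataset from the release test mentioned above (the helper function is illustrative, not part of Ray's API):

```python
GiB = 1024 ** 3

def num_shuffle_blocks(dataset_bytes: int, target_block_bytes: int) -> int:
    """Blocks produced if outputs are sliced to the target block size
    (ceiling division, since the last block may be partial)."""
    return -(-dataset_bytes // target_block_bytes)

one_tb = 1024 * GiB
# Smaller 128 MiB blocks: many more objects to track and shuffle.
print(num_shuffle_blocks(one_tb, 128 * 1024 * 1024))  # 8192
# 1 GiB default restored by this PR: far fewer blocks,
# at the cost of higher per-task heap usage.
print(num_shuffle_blocks(one_tb, 1 * GiB))            # 1024
```

Users who hit heap memory pressure can lower the target block size via Ray Data's DataContext configuration, as the PR text suggests; the exact attribute to set should be checked against the Ray version in use.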
Why are these changes needed?
This slices a task's output blocks to ensure that we respect the target max block size. This can cause a performance penalty for cases where the batch size is misaligned with the output block size, but this is necessary for stability and can be optimized later (by auto-choosing a better batch size).
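The slicing behavior described above can be sketched as a generator that accumulates rows and emits a block whenever adding the next row would exceed the target size. This is a simplified illustration of the idea, not Ray Data's actual block builder; names and the byte-size callback are assumptions.

```python
def slice_output(rows, target_block_bytes, row_nbytes=len):
    """Yield output blocks (lists of rows) whose total size stays at or
    under target_block_bytes whenever possible. A single oversized row
    still becomes its own block, mirroring the fact that a block can
    never be smaller than one row."""
    block, block_bytes = [], 0
    for row in rows:
        size = row_nbytes(row)
        if block and block_bytes + size > target_block_bytes:
            yield block  # emit a slice that respects the target
            block, block_bytes = [], 0
        block.append(row)
        block_bytes += size
    if block:
        yield block  # final, possibly partial, block

# Five 40-byte rows with a 100-byte target: blocks of 2, 2, and 1 rows.
blocks = list(slice_output([b"x" * 40] * 5, target_block_bytes=100))
print([len(b) for b in blocks])  # [2, 2, 1]
```

Note the misalignment the PR text mentions: with a 40-byte batch and a 100-byte target, blocks land at 80 bytes rather than 100, which is the performance penalty that auto-choosing a better batch size could later remove.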
Related issue number
#40026.
Checks
- I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- If I have added a method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.